Of goals and habits.

Author

  • Nathaniel D Daw
Abstract

In the television series The Wire, addicts Bubbles and Johnny regularly engage in bizarre and elaborate schemes to obtain drugs, ranging from robbing an ambulance to stealing drugs by lowering a fishing line from a rooftop. That these fictional crimes can be so meticulously planned and yet focused on such narrow, shortsighted goals highlights a gap in our understanding of how the real brain deploys deliberative vs. automatic mechanisms to make decisions. On a standard account, people can deliberatively evaluate the consequences of candidate actions, which gives us the flexibility to dream up novel plans. Alternatively, the brain can crystallize repeatedly successful behaviors into habits: learned reflexes that free up resources by executing the behaviors automatically (although at the expense of inflexibility and, it is believed, underpinning pathological compulsions).

As with most dichotomies, the problem with this view is that the world is not so black and white. Much as the drug-seeking behavior of addicts seems not to fit into either category, for healthy behaviors too, neither of these two sorts of decision making is very practical on its own. In PNAS, Cushman and Morris suggest a hybrid of these mechanisms, and show behavioral evidence that humans use it to plan actions (1).

The study of Cushman and Morris (1) draws on recent advances that use computational models of learning to make these strategies explicit enough that their hallmarks can be measured in choice behavior. Decisions are often modeled as determined by one’s estimates of the rewards expected from different options. There are many different methods to compute these decision variables.
Deliberative planning can be formalized by an algorithmic family called “model-based reinforcement learning” (2), which evaluates candidate sequences of actions much as a chess computer does: by exhaustively searching the “tree” of their future consequences, generated using a learned model of the task contingencies (like the rules of chess or the map of a maze) (Fig. 1A). The key feature of habits, in this view, is that they instead rely on a simpler summary of the end results of this computation, such as the overall long-run reward expected following some action (2). This precomputation gives them both their simplicity and their inflexibility. These summaries do not actually need to be derived from exhaustive computations using a model, but can instead be learned directly (although slowly) from experience, in a “model-free” fashion (3).

Cushman and Morris (1) propose a hybrid of these strategies, in which each system simplifies the problem faced by the other (Fig. 1B). The hypothesis is that, in a multistep decision problem, model-free learning selects a goal or subgoal, then model-based planning figures out how to get there. If I wanted to travel to Paris, a goal might be the airport; in chess, it might be forking the opponent’s queen. This is a useful division of labor because, in general, finding the best action using a model-based search is too complicated, given the many possible trajectories of future action. However, given a single goal, finding a good path to it can be more manageable. Meanwhile, although model-free learning is computationally simple, it learns slowly in large, multistep tasks. Narrowing its attention to a smaller, more abstracted problem (choice between a few candidate goals) lets its simplicity shine.

The key to the experiments of Cushman and Morris (1) is that these strategies learn in different ways from experience in trial-and-error decision tasks (4).
Indeed, just how behavior changes following a single, carefully arranged trial of experience can reveal things like whether a goal was updated and, if so, how. The authors (1) harnessed the efficiency of online data collection to zero in on these rare informative choices across a large number of subjects.

In the studies (1), participants chose between actions, which led to intermediate situations represented by different colors, and then to monetary reward. If a choice that led to the blue intermediate goal was followed by a large payoff, subjects tended to choose actions leading to blue again later. This was true even if the later action needed to get to blue was a different or even completely novel one, which suggests that planning to get to blue was model-based. This is because model-based planning, by evaluating actions in terms of their predicted consequences, can generalize across different ways to achieve the same result (5). If I want to get to the airport, model-based learning can find a route there from anywhere, whereas model-free learning can only repeat previously successful routes.

Surprisingly, a rewarded choice for blue was repeated even if the subjects in the study (1) had failed to achieve the intended blue goal on the initial choice, so that this reward was irrelevant to the true value of the goal. This apparent mistake is exactly what model-free selection predicts, because it assesses the success of a strategy (here, choosing blue) in terms of the reward received. Four experiments ...
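
The division of labor described above can be illustrated with a minimal Python sketch. It is not the model fitted by Cushman and Morris (1); the task layout (`MODEL`), the state and action names, the learning rate `alpha`, and the greedy goal choice are all assumptions made for illustration. A model-free step credits reward to whichever goal was selected, and a model-based step searches a learned model for a route to that goal.

```python
from collections import deque

# Hypothetical learned task model: state -> {action: predicted next state}.
MODEL = {
    "start": {"left": "hall", "right": "red"},
    "hall": {"door": "blue"},
    "alt_start": {"stairs": "blue", "ramp": "red"},
}


def plan_to_goal(model, start, goal):
    """Model-based step: breadth-first search over the model for the
    shortest action sequence predicted to reach `goal` from `start`."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for action, next_state in model.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, path + [action]))
    return None  # goal not reachable under the model


def update_goal_value(goal_values, chosen_goal, reward, alpha=0.5):
    """Model-free step: move the value of the goal that was *selected*
    toward the reward received, whether or not that goal was reached."""
    goal_values[chosen_goal] += alpha * (reward - goal_values[chosen_goal])


def choose_goal(goal_values):
    """Greedy goal selection (an assumption; any choice rule could be used)."""
    return max(goal_values, key=goal_values.get)


# One informative trial: the agent aimed for blue, ended up elsewhere,
# and was rewarded anyway. The reward is still credited to "blue".
goal_values = {"blue": 0.0, "red": 0.0}
update_goal_value(goal_values, "blue", reward=1.0)

# On a later trial from a different start state, blue is chosen again and
# the planner finds a route to it using an action never taken before.
next_goal = choose_goal(goal_values)
next_plan = plan_to_goal(MODEL, "alt_start", next_goal)
```

In this sketch the two behavioral signatures discussed above come from different components: reaching blue through a novel action comes from `plan_to_goal`, while repeating the choice of blue after a reward that did not actually follow from blue comes from `update_goal_value`.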

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 112, Issue 45

Pages: -

Publication date: 2015